Abstract

Twenty-six Head Start preschool classrooms participated in a yearlong intervention designed to link the Head Start Child 
Outcomes Framework with authentic assessment practices. Teachers in intervention and pilot classrooms implemented an 
assessment approach that incorporated the use of a curriculum-based assessment tool, the development of portfolios aligned 
with the mandated Head Start Child Outcomes, and the integration of this child assessment information into individual and 
classroom instructional planning. During the intervention period, comparison classrooms continued to use the assessment 
approach adopted by the local Head Start program, which included the use of a standardized assessment tool and the use of 
an agency-developed lesson plan form. Intervention and pilot classrooms demonstrated significant improvements on some 
dimensions of classroom quality as measured by the Early Language and Literacy Classroom Observation (ELLCO) toolkit, 
whereas comparison classrooms exhibited no change in classroom quality. Implications for practice are discussed. 


Introduction 

The application of standards to educational programs as a measure of accountability has become 
commonplace (National Child Care Information Center, 2006; Scott-Little, Kagan, & Frelow, 
2003). In the field of early care and education, this emphasis on standards is often viewed as 
counter to developmentally appropriate practice and can misguide programs to engage in 
assessment practices that are not recommended for young children (Meisels, 2000; Neisworth & 
Bagnato, 2004). This study describes a federally funded project that utilizes the Head Start Child 
Outcomes Framework as the basis for appropriate authentic assessment practices integrated into 
instructional planning for young children. This model of outcomes-driven authentic assessment 
linked to classroom instruction is then examined to determine its effect on classroom quality in 
preschool programs. 

The increasing emphasis on accountability in early care and education programs has illuminated 
the need for rethinking assessment systems within the field of early childhood education. 
Increasingly, states and early education entities are developing child standards or child 
outcomes for children birth through 5 years of age. Head Start specifically developed its Child 
Outcomes Framework in 2000, outlining the expected outcomes for 4-year-olds as they exit the 
program (U.S. Department of Health and Human Services, 2003). In response to the Good Start, 
Grow Smart initiative, states are developing their own set of standards/outcomes for children. 
The National Child Care Information Center (NCCIC) lists 42 states currently holding or 
developing early learning guidelines (NCCIC, 2006). A joint position statement of the National 
Association for the Education of Young Children and the National Association of Early Childhood 
Specialists in State Departments of Education (NAEYC & NAECS/SDE, 2003) cites both risks and 
benefits of such early learning standards/outcomes. The potential pitfalls of articulating child 
standards/outcomes include the negative impact on curriculum and a narrowing of focus on early 
education activities. However, standards may also bring benefits: they may help teachers and 
programs develop clearer expectations for curriculum and learning goals, facilitate continuity 
across grade levels, and highlight ways to support children with special needs. 

The primary challenge in applying early learning standards or child outcomes to early care and 
education programs is the potential disconnect between outcomes and appropriate assessment 
processes for gathering information that can be used at the programmatic level. Head Start, in 
particular, has implemented a National Reporting System that uses a standardized assessment 
process to document child outcomes (Rothman, 2005). However, the assessment literature in early 
childhood education underscores the difficulty in obtaining reliable and valid information from 
young children in standardized assessment formats (Bagnato & Neisworth, 1995; Rafoth, 1997). 
Furthermore, the paucity of appropriate assessment tools for young children creates a dilemma in 
the implementation of a standards-based approach in early care and education. 
The appropriate use of assessment tools is also a concern. Many early care and education 
programs use standardized diagnostic tools for purposes of instructional planning rather than for 
their intended clinical purpose (Rafoth, 1997). Taken together, these issues emphasize the 
difficulty in addressing the accountability mandates in the field of early care and education. 

Historically, the field of early childhood education has emphasized naturalistic assessment 
strategies, such as observation and parent interview, as the most appropriate ways to gather 
meaningful assessment information for young children. Current recommended practices in both 
early childhood education and early childhood special education focus on authentic assessment 
approaches. Both the National Association for the Education of Young Children (Bredekamp & 
Copple, 1997) and the Council for Exceptional Children's Division for Early Childhood (Sandall, 
McLean, & Smith, 2000) have established guidelines for appropriate assessment practices for 
young children. These guidelines point to the need for assessment approaches that are 
developmentally appropriate in terms of the purposes, content, and methods that are used. 

When assessment is being conducted to support program planning, it should be authentic in that 
it is ongoing, is conducted in the children's natural contexts, and provides information that is 
useful in planning for each child. 

However, very little empirical research has been conducted on authentic assessment processes. 
The work of Meisels and colleagues is the exception. Meisels, Liaw, Dorfman, and Nelson (1995) 
found moderate to high levels of reliability and high predictive validity in the developmental 
checklist of the Work Sampling System. Additionally, a recent study suggests positive impacts 
on children's achievement scores in reading and math when teachers use a curriculum-embedded 
performance assessment system (Meisels, Atkins-Burnett, Xue, Nicholson, Bickel, & Son, 2003). 

The intervention approach in this study relied on the use of authentic assessment approaches 
aligned with the Head Start Child Outcomes Framework. Specifically, Project LINK (A Partnership 
to Promote LINKages among Assessment, Curriculum, and Outcomes in Order to Enhance School 
Success for Children in Head Start Programs) was a federally funded project that utilized 
recommended practices in early childhood assessment as a means for documenting 
accountability. The Assessment, Evaluation, and Planning System (AEPS) for birth to 3 years and 
3 to 6 years (Bricker, 2002) was used in the fall and spring to document children's 
developmental progress. The AEPS is a curriculum-based assessment tool designed to assess 
children's development and learning across six developmental areas: gross motor, fine motor, 
social, social-communication, adaptive, and cognitive (Bricker, 2002). In the Project 
LINK model, the implementation of the AEPS was guided by the use of activity-based protocols. 
Specifically, six activity-based protocols were developed that, combined with a parent interview 
and social-communication child observation, complete the full battery of the AEPS. The 
information gathered from the AEPS was then used to develop individualized child plans for all 
children enrolled in the Head Start classrooms. After the development of individualized plans, 
teachers used child assessment data to guide curriculum planning in the classroom (see Figure 
1). Additionally, portfolios were developed for all of the children, guided by the mandated Head 
Start Child Outcomes as well as their individualized goals. 




Figure 1. Project LINK Intervention Model. 


Comparison classrooms in this study followed current agency procedures for child assessment 
and curriculum planning. Specifically, classroom teachers used the Learning Accomplishment 
Profile-Diagnostic (LAP-D, 1992) to collect child assessment data three times a year. 
Additionally, teachers developed weekly lesson plans using the agency lesson planning format 
and collected anecdotal observation data on each child in the classroom. Agency policy required 
teachers to collect one anecdote in each developmental area per child per week. 

Conceptually, Project LINK was designed to be a two-year intervention with a two-year 
evaluation plan; the first year involved examining classroom quality, and the second year 
involved examining standardized child outcomes and classroom quality. This study outlines the 
preliminary findings from the pilot and second year of intervention from Project LINK, examining 
the effects of an outcomes-driven authentic assessment process on classroom quality. 

Method 


Participants 

Project LINK was developed and implemented in partnership with one large multicounty Head 
Start program consisting of 28 direct-managed preschool classrooms. During the project design, 
it was determined that first-year teachers would not be included in the project implementation. 
Thus, only 26 classrooms participated in the project. Given that the classrooms were distributed 
over multiple sites, classrooms were selected by site to avoid spillover effects. The 26 
participating classrooms represented 13 different sites. During the first year of the project, pilot 
sites were selected in partnership with the administration of the Head Start program. Eight 
classrooms were selected by Head Start to participate in the pilot portion of the intervention to 
refine intervention processes and inform model development. The remaining 18 direct-managed 
preschool classrooms in this Head Start grantee were randomly assigned to intervention and 
comparison groups by location site and stratified by metropolitan status (urban/rural). The pilot 
classrooms received the intervention during the pilot year and the subsequent intervention year. 
During this two-year period, no changes in lead teachers occurred in these classrooms. The 
intervention group only participated during the targeted second year of intervention. The data 
presented in Table 1 reflect information collected during the target intervention year. 


Table 1 

Descriptive Data for Lead Teacher Background Characteristics 

Variable               Pilot (n = 8)    Intervention (n = 9)    Comparison (n = 9)
Race
  African American     5 (62.5)         2 (22.2)                2 (22.2)
  Caucasian            3 (37.5)         7 (77.8)                7 (77.8)
Education Level
  High School          1 (12.5)         -                       1 (11.1)
  AA                   2 (25.0)         3 (33.3)                4 (44.4)
  BA                   5 (62.5)         5 (55.6)                4 (44.4)
  MA                   -                1 (11.1)                -
Years of Experience
  1-2 years            -                -                       -
  3-5 years            2 (25.0)         1 (11.1)                3 (33.3)
  6-10 years           3 (37.5)         5 (55.6)                3 (33.3)
  More than 10 years   3 (37.5)         3 (33.3)                3 (33.3)

Note. Values are n (%). A dash indicates a blank cell.
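
The site-level random assignment described under Participants (sites, rather than individual classrooms, assigned to groups within urban and rural strata) can be illustrated with a minimal sketch. The site names, the number of sites per stratum, the seed, and the function name `assign_sites` are all hypothetical; the source does not report how the draw was actually carried out.

```python
import random
from collections import defaultdict


def assign_sites(site_strata, seed=0):
    """Randomly split sites into two groups within each stratum.

    site_strata maps a site name to its stratum label ("urban" or "rural").
    Every classroom at a site inherits the site's assignment, so teachers
    in different groups never share a building.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    by_stratum = defaultdict(list)
    for site, stratum in sorted(site_strata.items()):
        by_stratum[stratum].append(site)

    assignment = {}
    for stratum, sites in sorted(by_stratum.items()):
        rng.shuffle(sites)
        half = len(sites) // 2  # with an odd count, comparison gets the extra site
        for i, site in enumerate(sites):
            assignment[site] = "intervention" if i < half else "comparison"
    return assignment


# Hypothetical sites; the source does not list site names or strata sizes.
example_sites = {
    "site_a": "urban", "site_b": "urban", "site_c": "urban", "site_d": "urban",
    "site_e": "rural", "site_f": "rural", "site_g": "rural", "site_h": "rural",
}
groups = assign_sites(example_sites, seed=42)
```

Assigning by site rather than by classroom trades some statistical power for protection against spillover, because all teachers in one building receive the same condition.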


Description of the Intervention 

Lead teachers, assistant teachers, and children's services coordinators (on-site program 
managers) attended two days of formal training on the Project LINK model at the beginning of 
the school year. Training was followed by weekly technical assistance visits throughout the year. 
The content of the training sessions was designed and delivered by the principal investigators 
and a specialist in preschool portfolio development. Teachers received instruction and practice 
on use of the AEPS through the activity-based protocols designed specifically for Project LINK. 
Teachers were also trained to interpret and utilize AEPS assessment results for developing 
children's individualized learning plans. Training on the use of a project-specific lesson plan form 
involved a process of connecting individual assessment results with learning objectives from the 
Head Start Outcomes Framework. Additionally, teachers were trained to develop a portfolio 
system for documenting children's ongoing progress toward individualized goals and the Head 
Start mandated outcomes. 

Technical assistance was provided by project staff, all of whom were graduate students in early 
childhood education. Weekly visits consisted of a variety of supports, including observation and 
feedback, provision of materials to support implementation of the model, assistance with 
technology, teacher curriculum resources, and troubleshooting. Visits lasted approximately one 
hour each week and varied according to the type of assistance provided. Although a range of 
assistance options was provided to all teachers, the level of help and content of the visits were 
highly individualized. For example, teachers with more background in child development and whose 
prior teaching more closely resembled Project LINK elements may have received more assistance 
with resources, feedback, and technology, while teachers with less experience or background 
knowledge received more direct modeling, observation, and guidance on use of the multiple 
elements of assessment, lesson planning, and individualization. 

All teachers were able to reach at least adequate levels of implementation with the model. AEPS 
assessments were completed for each child in the fall and spring using activity-based protocols. 
Individualized learning plans were developed for each child in the classrooms and updated or 
monitored on a regular basis. Group lesson plans were completed every week, and individual 
portfolios were created for each child in the classroom to collect evidence of progress throughout 
the school year. 

Measures 


Classroom data were collected using the Early Childhood Environment Rating Scale-Revised 
Edition (ECERS-R, Harms, Clifford, & Cryer, 1998) and the Early Language and Literacy 
Classroom Observation Toolkit (ELLCO, Smith & Dickinson, 2002). Inter-rater reliability was 
established at 86.72% at the .60 level for the ECERS-R and at 100% at a kappa of .60 for the 
ELLCO. Descriptions of each classroom measure are outlined below. 
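
For readers less familiar with the reliability statistics named above, the following minimal sketch contrasts simple percent agreement with Cohen's kappa, which corrects agreement for chance. The ratings are hypothetical and the function names are illustrative; the source does not describe how its reliability figures were computed, so this shows only what the statistics measure, not the project's procedure.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters gave the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n) for c in set(rater_a) | set(rater_b)
    )
    return (observed - expected) / (1 - expected)


# Hypothetical item scores from two observers rating the same classroom.
observer_1 = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2]
observer_2 = [1, 2, 3, 3, 1, 2, 3, 2, 1, 2]
agreement = percent_agreement(observer_1, observer_2)  # 0.8
kappa = cohens_kappa(observer_1, observer_2)  # about 0.70
```

In this invented example the observers agree on 80% of items, but kappa is lower (about .70) because some of that agreement would be expected by chance alone.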

ECERS-R (Harms, Clifford, & Cryer, 1998) is a widely used program quality measure designed to 
assess group programs for children of preschool through kindergarten age, 2½ through 5 years. The 
scale consists of 43 items organized in 7 subscales (space and furnishings, personal care 
routines, language-reasoning, activities, interactions, program structure, and parents and staff). 
The subscale internal consistencies for ECERS-R range from .71 to .88, and the total scale 
internal consistency is .92 (Harms, Clifford, & Cryer, 1998). 

ELLCO (Smith & Dickinson, 2002) is a comprehensive set of observation tools designed to 
describe the extent to which classrooms provide children optimal support for their language and 
literacy development. The complete toolkit includes three independent research tools (literacy 
environment checklist, classroom observation, and literacy activities rating scale). The reliability 
and validity of the three independent tools have been examined. Cronbach's alphas of .84 for 
the literacy environment checklist, .90 for the classroom observation, and .66 for the 
literacy activities rating scale show acceptable to good internal consistency (Smith & Dickinson, 
2002). 
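
The internal consistency values cited above are Cronbach's alphas. As a minimal sketch of how such a coefficient is obtained, the function below applies the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), to a small set of hypothetical scores. The data and the function name are illustrative only and are not drawn from the ELLCO or ECERS-R validation samples.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-classroom item-score lists."""
    k = len(rows[0])  # number of items on the scale
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: five classrooms rated on three items.
scores = [[3, 4, 3], [4, 5, 5], [2, 3, 2], [5, 5, 4], [3, 3, 3]]
alpha = cronbach_alpha(scores)  # about 0.94
```

Alpha rises when items vary together across classrooms, which is why a short scale with heterogeneous items, such as the literacy activities rating scale, can show a lower coefficient.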

Procedures for Data Collection 

Intervention, pilot, and comparison classrooms were observed at the beginning and end of the 
2004-2005 year, during scheduled observation times. In the case of the pilot group, this was the 
second year of implementation of the Project LINK model, given that they had participated on a 
pilot basis the prior year. The data were collected by seven master's-level and doctoral-level 
graduate students trained in the implementation of the measures. The ECERS-R data (Harms, 
Clifford, & Cryer, 1998) and ELLCO data (Smith & Dickinson, 2002) were collected during the 
same observation period, which lasted from approximately two to four hours. Data collectors 
scheduled their observations with teachers prior to data collection. Data were entered into SPSS 
12.0 for analysis. 


Results 

Descriptive analyses were conducted for both the ECERS-R (Harms et al., 1998) and the ELLCO 
(Smith & Dickinson, 2002) for each of the three groups. Table 2 outlines the means and standard 
deviations for the pretest and posttest scores for the ECERS-R composite and the three ELLCO 
scales for each of the three groups (intervention, pilot, and comparison). Change scores were 
then calculated for each of the three groups on all subscale measures (Table 3). ANOVAs were 
then calculated to examine differences in change scores among the three groups. No statistically 
significant differences were found relative to the ECERS-R. However, differences in the quality of 
the language and literacy environment as measured by the ELLCO were found. Specifically, 
statistically significant differences were found between change scores on the ELLCO Literacy 
Environment Checklist, F(2, 23) = 4.82, p < .05, and the ELLCO Classroom Observation, F(2, 23) 
= 10.10, p < .01. Post hoc analysis using the Scheffé test indicated that the pilot group 
improved more than the comparison group on the ELLCO Literacy Environment Checklist. The 
pilot group also improved more on the ELLCO Classroom Observation than both intervention and 
comparison groups. The intervention group improved more than the comparison group on the 
ELLCO Classroom Observation. 


Table 2 

Comparison of Pretest and Posttest Means for Classroom Quality 

Variable                                    Pretest M (SD)    Posttest M (SD)
Pilot Group (n = 8)
  ELLCO - Literacy Environment Checklist    22.38 (6.70)      33.50 (7.21)
  ELLCO - Classroom Observation             45.00 (5.78)      57.25 (4.89)
  ELLCO - Literacy Activity Rating Scale    3.75 (2.61)       6.13 (3.18)
  ECERS-R Composite                         4.57 (.74)        5.28 (.68)
Intervention Group (n = 9)
  ELLCO - Literacy Environment Checklist    27.56 (10.26)     29.44 (7.50)
  ELLCO - Classroom Observation             47.89 (11.68)     56.11 (6.57)
  ELLCO - Literacy Activity Rating Scale    6.11 (2.21)       7.78 (2.39)
  ECERS-R Composite                         4.81 (.43)        5.24 (.47)
Comparison Group (n = 9)
  ELLCO - Literacy Environment Checklist    28.78 (7.81)      26.56 (9.61)
  ELLCO - Classroom Observation             50.89 (5.26)      48.00 (8.70)
  ELLCO - Literacy Activity Rating Scale    6.22 (2.05)       5.22 (2.22)
  ECERS-R Composite                         4.43 (.45)        4.97 (.28)


Table 3 

Mean Change Scores for Classroom Quality 

Variable                                Intervention M (SD)    Pilot M (SD)     Comparison M (SD)    F
ELLCO Literacy Environment Checklist    1.89 (6.37)            11.13 (8.32)     -2.22 (11.49)        4.82*
ELLCO Classroom Observation             8.22 (7.89)            12.25 (6.25)     -2.89 (7.42)         10.10**
ELLCO Literacy Activity                 1.67 (3.35)            2.38 (4.17)      -1.00 (2.18)         2.54
ECERS-R Composite                       .44 (.66)              .71 (.85)        .54 (.64)            .32

*p < .05. 

**p < .01. 
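
The F statistics in Table 3 can be checked from the summary values alone, because a one-way ANOVA needs only each group's size, mean, and standard deviation. The sketch below is an illustration under that assumption, not the authors' SPSS procedure; the function name `anova_f_from_summary` is hypothetical, and the inputs are the rounded change-score values printed in Table 3.

```python
def anova_f_from_summary(ns, means, sds):
    """One-way ANOVA F statistic from group sizes, means, and SDs."""
    total_n = sum(ns)
    k = len(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-groups sum of squares, recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between, df_within = k - 1, total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Group order: intervention, pilot, comparison (values copied from Table 3).
ns = [9, 8, 9]
f_checklist, df1, df2 = anova_f_from_summary(
    ns, [1.89, 11.13, -2.22], [6.37, 8.32, 11.49]
)
f_observation, _, _ = anova_f_from_summary(
    ns, [8.22, 12.25, -2.89], [7.89, 6.25, 7.42]
)
```

With these inputs the degrees of freedom are 2 and 23, and the recomputed values agree with the reported F statistics of 4.82 and 10.10 to within rounding of the tabled means and standard deviations.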


Discussion 

The infusion of standards into early education programs requires thoughtful planning and 
reflection on an array of program practices, including child assessment, curriculum planning and 
implementation, as well as data reporting. The model described in this study is one that allows 
recommended practices in child assessment to guide these processes while still addressing 
accountability standards. However, the intent of this study was to examine the impact of an 
outcomes-driven authentic assessment model on classroom quality and in particular language 
and literacy environments given the increased focus on language and literacy instruction and the 
requirements for this emphasis as mandated by the Head Start Child Outcomes Framework. 

Findings from this study suggest that an authentic assessment approach may have a positive 
impact on the language and literacy environment. Differences were found in both the pilot and 
intervention groups on the ELLCO (Smith & Dickinson, 2002) classroom observation. However, 
significant differences between pre- and post-observations were not detected on the ECERS-R. It 
should be noted that all mean ECERS-R scores were initially below 5.0 (the score indicative of 
"good" quality) and that the pilot and intervention groups ended the year above 5.0 (5.28 and 
5.24, respectively), while the comparison group ended at 4.97. Given that the ECERS-R is a 
measure of program quality with an emphasis on structural quality, it is not surprising that 
differences were not found. 


The ECERS-R does not focus on instructional quality and has little emphasis on literacy 
instruction in particular (Dickinson, 2002; Stipek & Byler, 2004). Results on the pre/post ELLCO 
scores suggest that providing a focused intervention on child assessment that is linked to 
standards and/or a particular content area (such as language and literacy) may result in 
improved instruction in that area. This improvement in quality related to the use of authentic 
assessment is consistent with the findings of Meisels et al. (2003), who found that the use of a 
performance-based curriculum-embedded assessment approach improved child outcomes in 
primary-age children as measured by standardized achievement test scores. More research is 
needed in this area to examine how the use of authentic assessment approaches influences 
teachers' planning of the early childhood curriculum as well as the subsequent impact on child 
outcomes. 

Results from this study should be interpreted cautiously. Data were collected from one large 
Head Start grantee and therefore cannot be generalized to other types of early care and 
education programs. Moreover, the pilot and intervention groups received different amounts of 
exposure to the intervention. Specifically, the pilot programs had participated for two years in 
the intervention, while the intervention group had only participated for one year. This 
discrepancy may explain why the data indicated that the pilot group made more gains than the 
intervention group. The pilot group had more time to digest, learn about, and implement the 
Project LINK model. Additional study of the impact of authentic assessment on classroom quality 
needs to be conducted with larger and more diverse samples of programs, teachers, and 
children. Moreover, because of limited project resources, graduate students were not able to 
remain blind to treatment groups. Despite these limitations, study results suggest that the use of 
authentic assessment in early education classrooms may provide an important link to improving 
classroom quality and curriculum planning. 

As the accountability movement unfolds and influences early care and education programs, the 
potential value of authentic assessment approaches should be systematically examined. Such 
approaches offer early education programs a means to implement recommended practices in 
child assessment while continuing to address the growing need to document child outcomes. 